Exploratory Data Analysis¶

In [1]:
from warnings import filterwarnings
filterwarnings('ignore')

Step 1: Read the Data Set¶

In [2]:
import pandas as pd
df = pd.read_csv('training_set.csv')
df.head()
Out[2]:
Id MSSubClass MSZoning LotFrontage LotArea Street Alley LotShape LandContour Utilities ... PoolArea PoolQC Fence MiscFeature MiscVal MoSold YrSold SaleType SaleCondition SalePrice
0 1 60 RL 65.0 8450 Pave NaN Reg Lvl AllPub ... 0 NaN NaN NaN 0 2 2008 WD Normal 208500
1 2 20 RL 80.0 9600 Pave NaN Reg Lvl AllPub ... 0 NaN NaN NaN 0 5 2007 WD Normal 181500
2 3 60 RL 68.0 11250 Pave NaN IR1 Lvl AllPub ... 0 NaN NaN NaN 0 9 2008 WD Normal 223500
3 4 70 RL 60.0 9550 Pave NaN IR1 Lvl AllPub ... 0 NaN NaN NaN 0 2 2006 WD Abnorml 140000
4 5 60 RL 84.0 14260 Pave NaN IR1 Lvl AllPub ... 0 NaN NaN NaN 0 12 2008 WD Normal 250000

5 rows × 81 columns

Step 2: Perform Basic Data Quality Checking¶

In [3]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1460 entries, 0 to 1459
Data columns (total 81 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Id             1460 non-null   int64  
 1   MSSubClass     1460 non-null   int64  
 2   MSZoning       1460 non-null   object 
 3   LotFrontage    1201 non-null   float64
 4   LotArea        1460 non-null   int64  
 5   Street         1460 non-null   object 
 6   Alley          91 non-null     object 
 7   LotShape       1460 non-null   object 
 8   LandContour    1460 non-null   object 
 9   Utilities      1460 non-null   object 
 10  LotConfig      1460 non-null   object 
 11  LandSlope      1460 non-null   object 
 12  Neighborhood   1460 non-null   object 
 13  Condition1     1460 non-null   object 
 14  Condition2     1460 non-null   object 
 15  BldgType       1460 non-null   object 
 16  HouseStyle     1460 non-null   object 
 17  OverallQual    1460 non-null   int64  
 18  OverallCond    1460 non-null   int64  
 19  YearBuilt      1460 non-null   int64  
 20  YearRemodAdd   1460 non-null   int64  
 21  RoofStyle      1460 non-null   object 
 22  RoofMatl       1460 non-null   object 
 23  Exterior1st    1460 non-null   object 
 24  Exterior2nd    1460 non-null   object 
 25  MasVnrType     588 non-null    object 
 26  MasVnrArea     1452 non-null   float64
 27  ExterQual      1460 non-null   object 
 28  ExterCond      1460 non-null   object 
 29  Foundation     1460 non-null   object 
 30  BsmtQual       1423 non-null   object 
 31  BsmtCond       1423 non-null   object 
 32  BsmtExposure   1422 non-null   object 
 33  BsmtFinType1   1423 non-null   object 
 34  BsmtFinSF1     1460 non-null   int64  
 35  BsmtFinType2   1422 non-null   object 
 36  BsmtFinSF2     1460 non-null   int64  
 37  BsmtUnfSF      1460 non-null   int64  
 38  TotalBsmtSF    1460 non-null   int64  
 39  Heating        1460 non-null   object 
 40  HeatingQC      1460 non-null   object 
 41  CentralAir     1460 non-null   object 
 42  Electrical     1459 non-null   object 
 43  1stFlrSF       1460 non-null   int64  
 44  2ndFlrSF       1460 non-null   int64  
 45  LowQualFinSF   1460 non-null   int64  
 46  GrLivArea      1460 non-null   int64  
 47  BsmtFullBath   1460 non-null   int64  
 48  BsmtHalfBath   1460 non-null   int64  
 49  FullBath       1460 non-null   int64  
 50  HalfBath       1460 non-null   int64  
 51  BedroomAbvGr   1460 non-null   int64  
 52  KitchenAbvGr   1460 non-null   int64  
 53  KitchenQual    1460 non-null   object 
 54  TotRmsAbvGrd   1460 non-null   int64  
 55  Functional     1460 non-null   object 
 56  Fireplaces     1460 non-null   int64  
 57  FireplaceQu    770 non-null    object 
 58  GarageType     1379 non-null   object 
 59  GarageYrBlt    1379 non-null   float64
 60  GarageFinish   1379 non-null   object 
 61  GarageCars     1460 non-null   int64  
 62  GarageArea     1460 non-null   int64  
 63  GarageQual     1379 non-null   object 
 64  GarageCond     1379 non-null   object 
 65  PavedDrive     1460 non-null   object 
 66  WoodDeckSF     1460 non-null   int64  
 67  OpenPorchSF    1460 non-null   int64  
 68  EnclosedPorch  1460 non-null   int64  
 69  3SsnPorch      1460 non-null   int64  
 70  ScreenPorch    1460 non-null   int64  
 71  PoolArea       1460 non-null   int64  
 72  PoolQC         7 non-null      object 
 73  Fence          281 non-null    object 
 74  MiscFeature    54 non-null     object 
 75  MiscVal        1460 non-null   int64  
 76  MoSold         1460 non-null   int64  
 77  YrSold         1460 non-null   int64  
 78  SaleType       1460 non-null   object 
 79  SaleCondition  1460 non-null   object 
 80  SalePrice      1460 non-null   int64  
dtypes: float64(3), int64(35), object(43)
memory usage: 924.0+ KB
In [4]:
# create a variable 'm' for finding missing values in data set
m = df.isna().sum() 
m[m>0]
Out[4]:
LotFrontage      259
Alley           1369
MasVnrType       872
MasVnrArea         8
BsmtQual          37
BsmtCond          37
BsmtExposure      38
BsmtFinType1      37
BsmtFinType2      38
Electrical         1
FireplaceQu      690
GarageType        81
GarageYrBlt       81
GarageFinish      81
GarageQual        81
GarageCond        81
PoolQC          1453
Fence           1179
MiscFeature     1406
dtype: int64
In [5]:
# Finding the Duplicated values in Data set
df.duplicated().sum()
Out[5]:
0

Step 3: Categorical and Contineous Variable Separation¶

i.e. Descriptive Analytics

In [6]:
# Firstly we drop the 'ID' column from the Dataset Cause it will not affect any further Results
df = df.drop(columns=['Id'])
df.head() 
Out[6]:
MSSubClass MSZoning LotFrontage LotArea Street Alley LotShape LandContour Utilities LotConfig ... PoolArea PoolQC Fence MiscFeature MiscVal MoSold YrSold SaleType SaleCondition SalePrice
0 60 RL 65.0 8450 Pave NaN Reg Lvl AllPub Inside ... 0 NaN NaN NaN 0 2 2008 WD Normal 208500
1 20 RL 80.0 9600 Pave NaN Reg Lvl AllPub FR2 ... 0 NaN NaN NaN 0 5 2007 WD Normal 181500
2 60 RL 68.0 11250 Pave NaN IR1 Lvl AllPub Inside ... 0 NaN NaN NaN 0 9 2008 WD Normal 223500
3 70 RL 60.0 9550 Pave NaN IR1 Lvl AllPub Corner ... 0 NaN NaN NaN 0 2 2006 WD Abnorml 140000
4 60 RL 84.0 14260 Pave NaN IR1 Lvl AllPub FR2 ... 0 NaN NaN NaN 0 12 2008 WD Normal 250000

5 rows × 80 columns

In [7]:
cat = list(df.columns[df.dtypes=='object'])
con = list(df.columns[df.dtypes!='object'])
In [8]:
cat
Out[8]:
['MSZoning',
 'Street',
 'Alley',
 'LotShape',
 'LandContour',
 'Utilities',
 'LotConfig',
 'LandSlope',
 'Neighborhood',
 'Condition1',
 'Condition2',
 'BldgType',
 'HouseStyle',
 'RoofStyle',
 'RoofMatl',
 'Exterior1st',
 'Exterior2nd',
 'MasVnrType',
 'ExterQual',
 'ExterCond',
 'Foundation',
 'BsmtQual',
 'BsmtCond',
 'BsmtExposure',
 'BsmtFinType1',
 'BsmtFinType2',
 'Heating',
 'HeatingQC',
 'CentralAir',
 'Electrical',
 'KitchenQual',
 'Functional',
 'FireplaceQu',
 'GarageType',
 'GarageFinish',
 'GarageQual',
 'GarageCond',
 'PavedDrive',
 'PoolQC',
 'Fence',
 'MiscFeature',
 'SaleType',
 'SaleCondition']
In [9]:
con
Out[9]:
['MSSubClass',
 'LotFrontage',
 'LotArea',
 'OverallQual',
 'OverallCond',
 'YearBuilt',
 'YearRemodAdd',
 'MasVnrArea',
 'BsmtFinSF1',
 'BsmtFinSF2',
 'BsmtUnfSF',
 'TotalBsmtSF',
 '1stFlrSF',
 '2ndFlrSF',
 'LowQualFinSF',
 'GrLivArea',
 'BsmtFullBath',
 'BsmtHalfBath',
 'FullBath',
 'HalfBath',
 'BedroomAbvGr',
 'KitchenAbvGr',
 'TotRmsAbvGrd',
 'Fireplaces',
 'GarageYrBlt',
 'GarageCars',
 'GarageArea',
 'WoodDeckSF',
 'OpenPorchSF',
 'EnclosedPorch',
 '3SsnPorch',
 'ScreenPorch',
 'PoolArea',
 'MiscVal',
 'MoSold',
 'YrSold',
 'SalePrice']
In [10]:
df[cat].describe().T 
# Describe function gives deascription of data on DataFrame
Out[10]:
count unique top freq
MSZoning 1460 5 RL 1151
Street 1460 2 Pave 1454
Alley 91 2 Grvl 50
LotShape 1460 4 Reg 925
LandContour 1460 4 Lvl 1311
Utilities 1460 2 AllPub 1459
LotConfig 1460 5 Inside 1052
LandSlope 1460 3 Gtl 1382
Neighborhood 1460 25 NAmes 225
Condition1 1460 9 Norm 1260
Condition2 1460 8 Norm 1445
BldgType 1460 5 1Fam 1220
HouseStyle 1460 8 1Story 726
RoofStyle 1460 6 Gable 1141
RoofMatl 1460 8 CompShg 1434
Exterior1st 1460 15 VinylSd 515
Exterior2nd 1460 16 VinylSd 504
MasVnrType 588 3 BrkFace 445
ExterQual 1460 4 TA 906
ExterCond 1460 5 TA 1282
Foundation 1460 6 PConc 647
BsmtQual 1423 4 TA 649
BsmtCond 1423 4 TA 1311
BsmtExposure 1422 4 No 953
BsmtFinType1 1423 6 Unf 430
BsmtFinType2 1422 6 Unf 1256
Heating 1460 6 GasA 1428
HeatingQC 1460 5 Ex 741
CentralAir 1460 2 Y 1365
Electrical 1459 5 SBrkr 1334
KitchenQual 1460 4 TA 735
Functional 1460 7 Typ 1360
FireplaceQu 770 5 Gd 380
GarageType 1379 6 Attchd 870
GarageFinish 1379 3 Unf 605
GarageQual 1379 5 TA 1311
GarageCond 1379 5 TA 1326
PavedDrive 1460 3 Y 1340
PoolQC 7 3 Gd 3
Fence 281 4 MnPrv 157
MiscFeature 54 4 Shed 49
SaleType 1460 9 WD 1267
SaleCondition 1460 6 Normal 1198
In [11]:
df['MSZoning'].value_counts()
Out[11]:
MSZoning
RL         1151
RM          218
FV           65
RH           16
C (all)      10
Name: count, dtype: int64
In [12]:
df[con].describe().T
Out[12]:
count mean std min 25% 50% 75% max
MSSubClass 1460.0 56.897260 42.300571 20.0 20.00 50.0 70.00 190.0
LotFrontage 1201.0 70.049958 24.284752 21.0 59.00 69.0 80.00 313.0
LotArea 1460.0 10516.828082 9981.264932 1300.0 7553.50 9478.5 11601.50 215245.0
OverallQual 1460.0 6.099315 1.382997 1.0 5.00 6.0 7.00 10.0
OverallCond 1460.0 5.575342 1.112799 1.0 5.00 5.0 6.00 9.0
YearBuilt 1460.0 1971.267808 30.202904 1872.0 1954.00 1973.0 2000.00 2010.0
YearRemodAdd 1460.0 1984.865753 20.645407 1950.0 1967.00 1994.0 2004.00 2010.0
MasVnrArea 1452.0 103.685262 181.066207 0.0 0.00 0.0 166.00 1600.0
BsmtFinSF1 1460.0 443.639726 456.098091 0.0 0.00 383.5 712.25 5644.0
BsmtFinSF2 1460.0 46.549315 161.319273 0.0 0.00 0.0 0.00 1474.0
BsmtUnfSF 1460.0 567.240411 441.866955 0.0 223.00 477.5 808.00 2336.0
TotalBsmtSF 1460.0 1057.429452 438.705324 0.0 795.75 991.5 1298.25 6110.0
1stFlrSF 1460.0 1162.626712 386.587738 334.0 882.00 1087.0 1391.25 4692.0
2ndFlrSF 1460.0 346.992466 436.528436 0.0 0.00 0.0 728.00 2065.0
LowQualFinSF 1460.0 5.844521 48.623081 0.0 0.00 0.0 0.00 572.0
GrLivArea 1460.0 1515.463699 525.480383 334.0 1129.50 1464.0 1776.75 5642.0
BsmtFullBath 1460.0 0.425342 0.518911 0.0 0.00 0.0 1.00 3.0
BsmtHalfBath 1460.0 0.057534 0.238753 0.0 0.00 0.0 0.00 2.0
FullBath 1460.0 1.565068 0.550916 0.0 1.00 2.0 2.00 3.0
HalfBath 1460.0 0.382877 0.502885 0.0 0.00 0.0 1.00 2.0
BedroomAbvGr 1460.0 2.866438 0.815778 0.0 2.00 3.0 3.00 8.0
KitchenAbvGr 1460.0 1.046575 0.220338 0.0 1.00 1.0 1.00 3.0
TotRmsAbvGrd 1460.0 6.517808 1.625393 2.0 5.00 6.0 7.00 14.0
Fireplaces 1460.0 0.613014 0.644666 0.0 0.00 1.0 1.00 3.0
GarageYrBlt 1379.0 1978.506164 24.689725 1900.0 1961.00 1980.0 2002.00 2010.0
GarageCars 1460.0 1.767123 0.747315 0.0 1.00 2.0 2.00 4.0
GarageArea 1460.0 472.980137 213.804841 0.0 334.50 480.0 576.00 1418.0
WoodDeckSF 1460.0 94.244521 125.338794 0.0 0.00 0.0 168.00 857.0
OpenPorchSF 1460.0 46.660274 66.256028 0.0 0.00 25.0 68.00 547.0
EnclosedPorch 1460.0 21.954110 61.119149 0.0 0.00 0.0 0.00 552.0
3SsnPorch 1460.0 3.409589 29.317331 0.0 0.00 0.0 0.00 508.0
ScreenPorch 1460.0 15.060959 55.757415 0.0 0.00 0.0 0.00 480.0
PoolArea 1460.0 2.758904 40.177307 0.0 0.00 0.0 0.00 738.0
MiscVal 1460.0 43.489041 496.123024 0.0 0.00 0.0 0.00 15500.0
MoSold 1460.0 6.321918 2.703626 1.0 5.00 6.0 8.00 12.0
YrSold 1460.0 2007.815753 1.328095 2006.0 2007.00 2008.0 2009.00 2010.0
SalePrice 1460.0 180921.195890 79442.502883 34900.0 129975.00 163000.0 214000.00 755000.0

Step 4: Data Visuals¶

Types od EDA:

  1. Univariate Analysis
  2. Bivariate Abalysis
  3. Multivariate Analysis

Univariate Analyasis¶

In [13]:
# Importing the Data Visualization libraries 
import matplotlib.pyplot as plt
import seaborn as sns
In [14]:
# Plotting Countplot for categorical Features 
for i in cat:
    plt.figure(figsize=(12,6))
    sns.countplot(data=df, x=i)
    plt.title(f'Countplot of {i}')
    plt.show()
In [15]:
# Plotting Histogram for contineous Features
for i in con:
    plt.figure(figsize=(12,6))
    sns.histplot(data=df, x=i, kde=True)
    plt.title(f'Histogram for {i}')
    plt.show()

Bivariate Analysis¶

Types of Bivariate Analysis:

  1. con vs con = Scatterplots
  2. cat vs con = Box Plots
  3. Cat vs cat = Crosstab Heatmap
In [16]:
con
Out[16]:
['MSSubClass',
 'LotFrontage',
 'LotArea',
 'OverallQual',
 'OverallCond',
 'YearBuilt',
 'YearRemodAdd',
 'MasVnrArea',
 'BsmtFinSF1',
 'BsmtFinSF2',
 'BsmtUnfSF',
 'TotalBsmtSF',
 '1stFlrSF',
 '2ndFlrSF',
 'LowQualFinSF',
 'GrLivArea',
 'BsmtFullBath',
 'BsmtHalfBath',
 'FullBath',
 'HalfBath',
 'BedroomAbvGr',
 'KitchenAbvGr',
 'TotRmsAbvGrd',
 'Fireplaces',
 'GarageYrBlt',
 'GarageCars',
 'GarageArea',
 'WoodDeckSF',
 'OpenPorchSF',
 'EnclosedPorch',
 '3SsnPorch',
 'ScreenPorch',
 'PoolArea',
 'MiscVal',
 'MoSold',
 'YrSold',
 'SalePrice']
In [17]:
# Plotting Scatterplots for con vs con
for i in con:
    if i!= 'SalePrice':
        plt.figure(figsize=(12,6))
        sns.scatterplot(data=df, x=i, y='SalePrice')
        plt.title(f'Scatterplot for {i} vs SalePrice')
        plt.show()
In [18]:
cat
Out[18]:
['MSZoning',
 'Street',
 'Alley',
 'LotShape',
 'LandContour',
 'Utilities',
 'LotConfig',
 'LandSlope',
 'Neighborhood',
 'Condition1',
 'Condition2',
 'BldgType',
 'HouseStyle',
 'RoofStyle',
 'RoofMatl',
 'Exterior1st',
 'Exterior2nd',
 'MasVnrType',
 'ExterQual',
 'ExterCond',
 'Foundation',
 'BsmtQual',
 'BsmtCond',
 'BsmtExposure',
 'BsmtFinType1',
 'BsmtFinType2',
 'Heating',
 'HeatingQC',
 'CentralAir',
 'Electrical',
 'KitchenQual',
 'Functional',
 'FireplaceQu',
 'GarageType',
 'GarageFinish',
 'GarageQual',
 'GarageCond',
 'PavedDrive',
 'PoolQC',
 'Fence',
 'MiscFeature',
 'SaleType',
 'SaleCondition']
In [19]:
# Plotting Boxplot for cat vs con
for i in cat:
    plt.figure(figsize=(12,6))
    sns.boxplot(data=df, x=i, y='SalePrice')
    plt.title(f'Boxplot for {i} vs SalePrice')
    plt.show()
In [20]:
cat
Out[20]:
['MSZoning',
 'Street',
 'Alley',
 'LotShape',
 'LandContour',
 'Utilities',
 'LotConfig',
 'LandSlope',
 'Neighborhood',
 'Condition1',
 'Condition2',
 'BldgType',
 'HouseStyle',
 'RoofStyle',
 'RoofMatl',
 'Exterior1st',
 'Exterior2nd',
 'MasVnrType',
 'ExterQual',
 'ExterCond',
 'Foundation',
 'BsmtQual',
 'BsmtCond',
 'BsmtExposure',
 'BsmtFinType1',
 'BsmtFinType2',
 'Heating',
 'HeatingQC',
 'CentralAir',
 'Electrical',
 'KitchenQual',
 'Functional',
 'FireplaceQu',
 'GarageType',
 'GarageFinish',
 'GarageQual',
 'GarageCond',
 'PavedDrive',
 'PoolQC',
 'Fence',
 'MiscFeature',
 'SaleType',
 'SaleCondition']
In [21]:
# Plotting Crosstab Heatmap for cat vs cat
ctab1 = pd.crosstab(df['Condition1'], df['Condition2'])
ctab1
Out[21]:
Condition2 Artery Feedr Norm PosA PosN RRAe RRAn RRNn
Condition1
Artery 2 0 45 1 0 0 0 0
Feedr 0 1 76 0 0 1 1 2
Norm 0 0 1260 0 0 0 0 0
PosA 0 0 8 0 0 0 0 0
PosN 0 0 17 0 2 0 0 0
RRAe 0 0 11 0 0 0 0 0
RRAn 0 4 22 0 0 0 0 0
RRNe 0 0 2 0 0 0 0 0
RRNn 0 1 4 0 0 0 0 0
In [22]:
sns.heatmap(ctab1, annot=True, fmt='d')
Out[22]:
<Axes: xlabel='Condition2', ylabel='Condition1'>

Multivariate Analysis¶

Types of Multivariate Analysis:

  1. Correlation Heatmap
  2. Pairplot
In [23]:
# Plotting Correlation Heatmap
cor = df.corr(numeric_only=True)
cor
Out[23]:
MSSubClass LotFrontage LotArea OverallQual OverallCond YearBuilt YearRemodAdd MasVnrArea BsmtFinSF1 BsmtFinSF2 ... WoodDeckSF OpenPorchSF EnclosedPorch 3SsnPorch ScreenPorch PoolArea MiscVal MoSold YrSold SalePrice
MSSubClass 1.000000 -0.386347 -0.139781 0.032628 -0.059316 0.027850 0.040581 0.022936 -0.069836 -0.065649 ... -0.012579 -0.006100 -0.012037 -0.043825 -0.026030 0.008283 -0.007683 -0.013585 -0.021407 -0.084284
LotFrontage -0.386347 1.000000 0.426095 0.251646 -0.059213 0.123349 0.088866 0.193458 0.233633 0.049900 ... 0.088521 0.151972 0.010700 0.070029 0.041383 0.206167 0.003368 0.011200 0.007450 0.351799
LotArea -0.139781 0.426095 1.000000 0.105806 -0.005636 0.014228 0.013788 0.104160 0.214103 0.111170 ... 0.171698 0.084774 -0.018340 0.020423 0.043160 0.077672 0.038068 0.001205 -0.014261 0.263843
OverallQual 0.032628 0.251646 0.105806 1.000000 -0.091932 0.572323 0.550684 0.411876 0.239666 -0.059119 ... 0.238923 0.308819 -0.113937 0.030371 0.064886 0.065166 -0.031406 0.070815 -0.027347 0.790982
OverallCond -0.059316 -0.059213 -0.005636 -0.091932 1.000000 -0.375983 0.073741 -0.128101 -0.046231 0.040229 ... -0.003334 -0.032589 0.070356 0.025504 0.054811 -0.001985 0.068777 -0.003511 0.043950 -0.077856
YearBuilt 0.027850 0.123349 0.014228 0.572323 -0.375983 1.000000 0.592855 0.315707 0.249503 -0.049107 ... 0.224880 0.188686 -0.387268 0.031355 -0.050364 0.004950 -0.034383 0.012398 -0.013618 0.522897
YearRemodAdd 0.040581 0.088866 0.013788 0.550684 0.073741 0.592855 1.000000 0.179618 0.128451 -0.067759 ... 0.205726 0.226298 -0.193919 0.045286 -0.038740 0.005829 -0.010286 0.021490 0.035743 0.507101
MasVnrArea 0.022936 0.193458 0.104160 0.411876 -0.128101 0.315707 0.179618 1.000000 0.264736 -0.072319 ... 0.159718 0.125703 -0.110204 0.018796 0.061466 0.011723 -0.029815 -0.005965 -0.008201 0.477493
BsmtFinSF1 -0.069836 0.233633 0.214103 0.239666 -0.046231 0.249503 0.128451 0.264736 1.000000 -0.050117 ... 0.204306 0.111761 -0.102303 0.026451 0.062021 0.140491 0.003571 -0.015727 0.014359 0.386420
BsmtFinSF2 -0.065649 0.049900 0.111170 -0.059119 0.040229 -0.049107 -0.067759 -0.072319 -0.050117 1.000000 ... 0.067898 0.003093 0.036543 -0.029993 0.088871 0.041709 0.004940 -0.015211 0.031706 -0.011378
BsmtUnfSF -0.140759 0.132644 -0.002618 0.308159 -0.136841 0.149040 0.181133 0.114442 -0.495251 -0.209294 ... -0.005316 0.129005 -0.002538 0.020764 -0.012579 -0.035092 -0.023837 0.034888 -0.041258 0.214479
TotalBsmtSF -0.238518 0.392075 0.260833 0.537808 -0.171098 0.391452 0.291066 0.363936 0.522396 0.104810 ... 0.232019 0.247264 -0.095478 0.037384 0.084489 0.126053 -0.018479 0.013196 -0.014969 0.613581
1stFlrSF -0.251758 0.457181 0.299475 0.476224 -0.144203 0.281986 0.240379 0.344501 0.445863 0.097117 ... 0.235459 0.211671 -0.065292 0.056104 0.088758 0.131525 -0.021096 0.031372 -0.013604 0.605852
2ndFlrSF 0.307886 0.080177 0.050986 0.295493 0.028942 0.010308 0.140024 0.174561 -0.137079 -0.099260 ... 0.092165 0.208026 0.061989 -0.024358 0.040606 0.081487 0.016197 0.035164 -0.028700 0.319334
LowQualFinSF 0.046474 0.038469 0.004779 -0.030429 0.025494 -0.183784 -0.062419 -0.069071 -0.064503 0.014807 ... -0.025444 0.018251 0.061081 -0.004296 0.026799 0.062157 -0.003793 -0.022174 -0.028921 -0.025606
GrLivArea 0.074853 0.402797 0.263116 0.593007 -0.079686 0.199010 0.287389 0.390857 0.208171 -0.009640 ... 0.247433 0.330224 0.009113 0.020643 0.101510 0.170205 -0.002416 0.050240 -0.036526 0.708624
BsmtFullBath 0.003491 0.100949 0.158155 0.111098 -0.054942 0.187599 0.119470 0.085310 0.649212 0.158678 ... 0.175315 0.067341 -0.049911 -0.000106 0.023148 0.067616 -0.023047 -0.025361 0.067049 0.227122
BsmtHalfBath -0.002333 -0.007234 0.048046 -0.040150 0.117821 -0.038162 -0.012337 0.026673 0.067418 0.070948 ... 0.040161 -0.025324 -0.008555 0.035114 0.032121 0.020025 -0.007367 0.032873 -0.046524 -0.016844
FullBath 0.131608 0.198769 0.126031 0.550600 -0.194149 0.468271 0.439046 0.276833 0.058543 -0.076444 ... 0.187703 0.259977 -0.115093 0.035353 -0.008106 0.049604 -0.014290 0.055872 -0.019669 0.560664
HalfBath 0.177354 0.053532 0.014259 0.273458 -0.060769 0.242656 0.183331 0.201444 0.004262 -0.032148 ... 0.108080 0.199740 -0.095317 -0.004972 0.072426 0.022381 0.001290 -0.009050 -0.010269 0.284108
BedroomAbvGr -0.023438 0.263170 0.119690 0.101676 0.012980 -0.070651 -0.040581 0.102821 -0.107355 -0.015728 ... 0.046854 0.093810 0.041570 -0.024478 0.044300 0.070703 0.007767 0.046544 -0.036014 0.168213
KitchenAbvGr 0.281721 -0.006069 -0.017784 -0.183882 -0.087001 -0.174800 -0.149598 -0.037610 -0.081007 -0.040751 ... -0.090130 -0.070091 0.037312 -0.024600 -0.051613 -0.014525 0.062341 0.026589 0.031687 -0.135907
TotRmsAbvGrd 0.040380 0.352096 0.190015 0.427452 -0.057583 0.095589 0.191740 0.280682 0.044316 -0.035227 ... 0.165984 0.234192 0.004151 -0.006683 0.059383 0.083757 0.024763 0.036907 -0.034516 0.533723
Fireplaces -0.045569 0.266639 0.271364 0.396765 -0.023820 0.147716 0.112581 0.249070 0.260011 0.046921 ... 0.200019 0.169405 -0.024822 0.011257 0.184530 0.095074 0.001409 0.046357 -0.024096 0.466929
GarageYrBlt 0.085072 0.070250 -0.024947 0.547766 -0.324297 0.825667 0.642277 0.252691 0.153484 -0.088011 ... 0.224577 0.228425 -0.297003 0.023544 -0.075418 -0.014501 -0.032417 0.005337 -0.001014 0.486362
GarageCars -0.040110 0.285691 0.154871 0.600671 -0.185758 0.537850 0.420622 0.364204 0.224054 -0.038264 ... 0.226342 0.213569 -0.151434 0.035765 0.050494 0.020934 -0.043080 0.040522 -0.039117 0.640409
GarageArea -0.098672 0.344997 0.180403 0.562022 -0.151521 0.478954 0.371600 0.373066 0.296970 -0.018227 ... 0.224666 0.241435 -0.121777 0.035087 0.051412 0.061047 -0.027400 0.027974 -0.027378 0.623431
WoodDeckSF -0.012579 0.088521 0.171698 0.238923 -0.003334 0.224880 0.205726 0.159718 0.204306 0.067898 ... 1.000000 0.058661 -0.125989 -0.032771 -0.074181 0.073378 -0.009551 0.021011 0.022270 0.324413
OpenPorchSF -0.006100 0.151972 0.084774 0.308819 -0.032589 0.188686 0.226298 0.125703 0.111761 0.003093 ... 0.058661 1.000000 -0.093079 -0.005842 0.074304 0.060762 -0.018584 0.071255 -0.057619 0.315856
EnclosedPorch -0.012037 0.010700 -0.018340 -0.113937 0.070356 -0.387268 -0.193919 -0.110204 -0.102303 0.036543 ... -0.125989 -0.093079 1.000000 -0.037305 -0.082864 0.054203 0.018361 -0.028887 -0.009916 -0.128578
3SsnPorch -0.043825 0.070029 0.020423 0.030371 0.025504 0.031355 0.045286 0.018796 0.026451 -0.029993 ... -0.032771 -0.005842 -0.037305 1.000000 -0.031436 -0.007992 0.000354 0.029474 0.018645 0.044584
ScreenPorch -0.026030 0.041383 0.043160 0.064886 0.054811 -0.050364 -0.038740 0.061466 0.062021 0.088871 ... -0.074181 0.074304 -0.082864 -0.031436 1.000000 0.051307 0.031946 0.023217 0.010694 0.111447
PoolArea 0.008283 0.206167 0.077672 0.065166 -0.001985 0.004950 0.005829 0.011723 0.140491 0.041709 ... 0.073378 0.060762 0.054203 -0.007992 0.051307 1.000000 0.029669 -0.033737 -0.059689 0.092404
MiscVal -0.007683 0.003368 0.038068 -0.031406 0.068777 -0.034383 -0.010286 -0.029815 0.003571 0.004940 ... -0.009551 -0.018584 0.018361 0.000354 0.031946 0.029669 1.000000 -0.006495 0.004906 -0.021190
MoSold -0.013585 0.011200 0.001205 0.070815 -0.003511 0.012398 0.021490 -0.005965 -0.015727 -0.015211 ... 0.021011 0.071255 -0.028887 0.029474 0.023217 -0.033737 -0.006495 1.000000 -0.145721 0.046432
YrSold -0.021407 0.007450 -0.014261 -0.027347 0.043950 -0.013618 0.035743 -0.008201 0.014359 0.031706 ... 0.022270 -0.057619 -0.009916 0.018645 0.010694 -0.059689 0.004906 -0.145721 1.000000 -0.028923
SalePrice -0.084284 0.351799 0.263843 0.790982 -0.077856 0.522897 0.507101 0.477493 0.386420 -0.011378 ... 0.324413 0.315856 -0.128578 0.044584 0.111447 0.092404 -0.021190 0.046432 -0.028923 1.000000

37 rows × 37 columns

In [24]:
plt.figure(figsize=(30,30))
sns.heatmap(cor, annot=True, fmt='.2f')
plt.show()
In [ ]: